Abstract

Many  types  of  common  objects,  such  as  tools  and  vehicles,  usually  move  in  simple  ways  when 
they  are  wielded  or  driven:  The  natural  axes  of  the  object  tend  to  remain  aligned  with  the 
local  trihedron  defined  by  the  object’s  trajectory.  Based  on  this  observation  we  use  a  model 
called  Frenet-Serret  motion  which  corresponds  to  the  motion  of  a  moving  trihedron  along  a 
space  curve.  Knowing  how  the  Frenet-Serret  frame  is  changing  relative  to  the  observer  gives 
us  essential  information  for  understanding  the  object’s  motion.  This  is  illustrated  here  for  four 
examples,  involving  tools  (a  wrench  and  a  saw)  and  vehicles  (an  accelerating  van,  a  turning 
taxi). 


Keywords:  Frenet-Serret  motion,  Normal  optic  flow,  Nonlinear  least  squares,  Motion  models, 
Object  dynamics,  Object  motion,  Vehicle  motion,  Tool  motion,  Weak  perspective  projection, 
Planar  motion. 


The support of the Office of Naval Research under Grant N00014-95-1-0521 is gratefully acknowledged.


1  Introduction 


An object moves because it is self-propelled (e.g., a vehicle) or because it is wielded (or thrown¹)
by an agent (e.g., a tool). Motion that performs a locomotional or mechanical function well
requires efficient energy transfer from the vehicle’s engine or the agent’s arm to the object, in
order to overcome the constraints imposed by the environment in which the motion
takes place (air resistance, friction, etc.). Assuming that an object has natural axes (e.g. the
long axis of a stick), efficient force transfer requires simple relationships between the natural
axes of the object and the motion trajectory. These relationships ensure that the object can
perform its function efficiently.

The  most  general  model  of  object  motion  is  unrestricted  rigid  motion.  This  type  of  motion 
is  not  common  in  everyday  life.  Usually  objects  are  supported,  and  motion  takes  place  when  an 
object  is  in  contact  with  a  surface,  another  object,  or  an  agent.  In  these  cases  (tool  acting  on  a 
recipient  object;  ground  vehicle)  the  motion  becomes  interestingly  constrained. 

In  our  work  we  consider  the  relationship  between  this  constrained  motion  and  the  object’s 
geometry.  To  analyze  this  relationship  we  use  two  frames:  the  object  frame  and  the  frame  of  the 
motion  trajectory.  “Efficient”  motion  calls  for  a  simple  relationship  between  the  object  frame 
and  the  motion  frame,  and  this  relationship  remains  constant  during  the  motion.  Based  on 
this  observation  we  use  a  model  called  Frenet-Serret  motion  which  corresponds  to  the  motion 
of  a  moving  trihedron  along  a  space  curve  [8].  The  parameters  of  the  motion  are  given  by  the 
curvature  and  torsion  of  the  space  curve  along  which  the  object  moves. 

In  practice  the  simple  nature  of  the  environment  in  which  the  object  moves  provides  further 
constraints. A ground vehicle is moving on relatively flat terrain, and a tool is often acting on
a  planar  surface.  The  motion  is  mostly  planar  (though  the  plane  might  rotate  slightly  through 
the  motion).  Over  a  long  time  period  the  motion  is  Frenet-Serret  and  over  a  short  time  period 
the  motion  is  approximately  planar  and  often  approximately  translational. 

We  use  the  relationship  between  the  object  frame  and  the  motion  frame  to  analyze  image 
sequences.  Given  a  sequence  of  images  of  the  moving  object,  our  analysis  enables  us  to  output 
the  motion  and  trajectory  parameters  of  the  object.  Knowing  how  the  Frenet-Serret  frame  is 
changing  relative  to  the  observer  gives  us  essential  information  for  understanding  the  object’s 
motion.  Our  analysis  can  also  handle  constraints  on  the  motion.  For  example,  the  parameters 
of  the  object’s  trajectory  depend  on  its  speed,  mass,  size,  and  on  the  medium  through  which  it 
moves.  These  factors  impose  bounds  on  the  curvature  and  torsion  of  the  trajectory. 

In  this  paper  we  approach  object  motion  understanding  through  analysis  of  long  image 
sequences.  A  key  question  in  this  context  is  how  to  relate  short-sequence  motion  estimation 
to  long-sequence  motion  estimation.  Using  the  Frenet-Serret  frame  provides  us  with  an  ability 
to  understand  motion  over  a  long  time  period.  We  can  derive  the  motion  parameters  from  the 
parameters  of  the  trajectory  and  obtain  motion  descriptions  suitable  for  long  sequence  analysis. 
Using these tools we can show, for example, that rotation becomes significant only in long
sequences, and that in a short sequence translation is usually dominant. We show that using
simplified scene and imaging models we can get adequate local estimates (short sequence, 2-4
frames) by analyzing the images, and by observing these estimates over a long sequence we
can accumulate them to describe the object’s trajectory. Analysis of the trajectory parameters
provides us with tools for understanding long-term object motion.

¹ We assume in this paper that the propulsive force is applied to the object continuously, unlike the case of a
projectile where it is applied only initially. We will not discuss projectiles further here.


2  Related  Work 

Understanding  object  motion  is  based  on  extracting  the  object’s  motion  parameters  from  an 
image  sequence.  Broida  and  Chellappa  [1]  proposed  a  framework  for  motion  estimation  of  a 
vehicle  using  Kalman  filtering.  Weng  et  al.  [16]  assumed  an  object  that  possesses  an  axis  of 
symmetry,  and  a  constant  angular  momentum  model  which  constrained  the  motion  over  a  local 
frame  subsequence  to  be  a  superposition  of  precession  and  translation.  The  trajectory  of  the 
center  of  rotation  can  be  approximated  by  a  vector  polynomial.  Changing  the  parameters  of  the 
model  with  time  allows  adaptation  to  long-term  changes  in  the  motion  characteristics.  Their 
work  was  based  on  correspondence;  at  least  eight  pairs  of  corresponding  points  were  needed. 

Accumulating  the  information  obtained  from  the  motion  analysis  of  the  sequence  to  achieve 
an estimate of the moving object’s trajectory is another step toward understanding object
motion. (A good survey of motion-based recognition was compiled by Cedras and Shah [5].)
Bruckstein  et  al.  [2,  3]  assumed  a  known  object  model  (a  rigid  rod  or  disk)  and  tried  to  recover 
the  object’s  trajectory  and  rotation.  They  showed  that  five  images  are  enough  to  recover  the 
motion  of  a  rod  or  a  disk  in  accordance  with  physical  laws.  Techniques  from  algebraic  geometry 
were  used  to  establish  the  existence  of  solutions  to  the  resulting  polynomial  equations. 

Engel  and  Rubin  [9]  (and  similarly  Gould  and  Shah  [11])  used  motion  characteristics  obtained 
by  tracking  representative  points  on  an  object  to  identify  important  events  corresponding  to 
changes  in  direction,  speed  and  acceleration  in  the  object’s  motion. 

Work  has  also  been  done  on  higher-level  descriptions  of  object  trajectories  in  terms  of  such 
concepts as stopping/starting, object interactions, and motion verbs [4, 12, 13]. This level of
object  motion  description  will  not  be  treated  in  this  paper. 

In  [6]  Duric  et  al.  tried  to  determine  the  function  of  an  object  from  its  motion.  Given  a 
sequence  of  images  of  a  known  object  performing  some  function,  they  attempted  to  determine 
what  that  function  was.  They  showed  that  the  motion  of  an  object,  when  combined  with 
information  about  the  object  and  its  uses,  provides  strong  constraints  on  the  possible  function 
being  performed.  Their  flow-based  analysis  treated  relatively  short  sequences. 

In  this  paper  a  model  for  object  trajectory  analysis  is  used,  and  a  constant  relationship 
between  the  object  frame  and  the  motion  frame  is  established.  The  use  of  the  Frenet-Serret 
frame  provides  a  vocabulary  appropriate  for  describing  longer  motion  sequences. 
3  Motion  Models


3.1  Rigid  Body  Motion 


To facilitate the derivation of the motion equations of a rigid body $B$ we use two rectangular
coordinate frames, one ($Oxyz$) fixed in space, the other ($Cx_1y_1z_1$) fixed in the body and moving
with it. The coordinates $X_1, Y_1, Z_1$ of any point $P$ of the body with respect to the moving frame
are constant with respect to time $t$, while the coordinates $X, Y, Z$ of the same point $P$ with
respect to the fixed frame are functions of $t$. It is assumed that these functions are differentiable
with respect to $t$. The position of the moving frame at any instant is given by the position
$\vec{d}_c = (X_c\ Y_c\ Z_c)^T$ of the origin $C$, and by the nine direction cosines of the axes of the moving
frame with respect to the fixed frame. Let $\vec{i}$, $\vec{j}$, and $\vec{k}$ be the unit vectors in the directions of the
$Ox$, $Oy$, and $Oz$ axes, respectively; and let $\vec{i}_1$, $\vec{j}_1$, and $\vec{k}_1$ be the unit vectors in the directions of
the $Cx_1$, $Cy_1$, and $Cz_1$ axes, respectively. For a given position $\vec{p}$ of $P$ in $Cx_1y_1z_1$ we have the
position $\vec{r}_p$ of $P$ in $Oxyz$:

$$\vec{r}_p = R\,\vec{p} + \vec{d}_c \tag{1}$$

where $R$ is the matrix of the direction cosines (the frames are taken as right-handed so that
$\det R = 1$). If we differentiate (1) with respect to time and use the fact that $\vec{p} = R^T(\vec{r}_p - \vec{d}_c)$,
we obtain

$$\dot{\vec{r}}_p = \dot{R}\,\vec{p} + \dot{\vec{d}}_c = \dot{R}R^T(\vec{r}_p - \vec{d}_c) + \dot{\vec{d}}_c = \Omega(\vec{r}_p - \vec{d}_c) + \dot{\vec{d}}_c. \tag{2}$$

The skew matrix $\Omega = \dot{R}R^T = -R\dot{R}^T$ is the rotational velocity matrix and $\dot{\vec{d}}_c$ is the translational
velocity vector. Multiplying a vector $(\vec{r}_p - \vec{d}_c)$ by the skew matrix $\Omega$ can be replaced by taking
the cross product $\vec{\omega} \times (\vec{r}_p - \vec{d}_c)$ where $\vec{\omega} = (\omega_x\ \omega_y\ \omega_z)^T$ is the rotational velocity vector.
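
The equivalence between the skew matrix and the cross product in equation (2) can be checked numerically. The following Python sketch is an illustration only; the function names (`skew`, `point_velocity`) and the tuple representation of vectors are our assumptions and not part of the original formulation.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def skew(w):
    """Skew matrix Omega such that Omega v = w x v for every 3-vector v."""
    wx, wy, wz = w
    return ((0.0, -wz, wy),
            (wz, 0.0, -wx),
            (-wy, wx, 0.0))


def matvec(m, v):
    """Product of a 3x3 matrix (tuple of rows) with a 3-vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def point_velocity(omega, r_p, d_c, d_c_dot):
    """Equation (2): velocity of a body point at r_p, written with the cross product."""
    rel = tuple(r - d for r, d in zip(r_p, d_c))
    rot = cross(omega, rel)
    return tuple(a + b for a, b in zip(rot, d_c_dot))
```

For example, a point at (1, 0, 0) on a body that rotates about the z-axis through the origin with unit rotational velocity, and does not translate, moves in the y direction.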


3.2  Motion  along  a  Smooth  Curve 

Consider a moving frame $Cx_1y_1z_1$ (fixed in a rigid body $B$), which moves with $C$ along
a space curve $\Gamma$ while rotating so that the $Cx_1$ and $Cy_1$ axes coincide with, respectively, the
tangent and principal normal of $\Gamma$. This means that as $C$ moves along $\Gamma$ the $Cx_1y_1z_1$ frame
coincides with the Frenet-Serret trihedron at $C$: $Ctnb$. This trihedron consists of the tangent $\vec{t}$,
the principal normal $\vec{n}$, and the binormal $\vec{b}$, which are mutually orthogonal (see Figure 1). The
geometry of this motion is completely defined by $\Gamma$.

Let $\vec{d}_\gamma(s)$ denote the position of $C$, in the fixed coordinate frame $Oxyz$, when it has moved
along $\Gamma$ through a total arc length of $s$. For any position $\vec{p}$ of a point $P$ on $B$ in $Ctnb$, the
position $\vec{r}_p$ in $Oxyz$ is given by (1) with the matrix of direction cosines $R$ suitably defined (see
Figure 1). If $\vec{t} = (t_1\ t_2\ t_3)^T$, $\vec{n} = (n_1\ n_2\ n_3)^T$ and $\vec{b} = (b_1\ b_2\ b_3)^T$ are the unit vectors along
$Ct$, $Cn$ and $Cb$, differential geometry gives us

$$\vec{t} = \vec{d}_\gamma', \qquad \vec{n} = \kappa^{-1}\vec{d}_\gamma'', \qquad \vec{b} = \vec{t} \times \vec{n}, \tag{3}$$

where $\kappa$ is the curvature of $\Gamma$ and primes denote derivatives with respect to $s$. Then we have

$$R = (\vec{t}\ \ \vec{n}\ \ \vec{b}) = \begin{pmatrix} t_1 & n_1 & b_1 \\ t_2 & n_2 & b_2 \\ t_3 & n_3 & b_3 \end{pmatrix}. \tag{4}$$

Figure 1: The Frenet-Serret coordinate frame moves along the path $\Gamma$.

We have the Frenet-Serret formulas [14]

$$\vec{t}\,' = \kappa\vec{n}, \qquad \vec{n}' = -\kappa\vec{t} + \tau\vec{b}, \qquad \vec{b}' = -\tau\vec{n} \tag{5}$$

where $\tau$ is the torsion of $\Gamma$. Using (4) and (5), equation (2) can be written as

$$\vec{r}_p' = \vec{\omega}_d \times (\vec{r}_p - \vec{d}_\gamma) + \vec{t} \tag{6}$$

where the Darboux vector $\vec{\omega}_d = \tau\vec{t} + \kappa\vec{b}$ is the rotational velocity vector and the unit tangent $\vec{t}$ of
$\Gamma$ is the translational velocity vector; the motion parameter is the arc length $s$. If, instead of using
arc length as a motion parameter, time $t$ is used, the rotational velocity $\vec{\omega}_d$ and translational
velocity $\vec{t}$ are scaled by the speed $v = ds/dt$ of the point $C$. In that case the equation of motion
becomes

$$\dot{\vec{r}}_p = v\,\vec{\omega}_d \times (\vec{r}_p - \vec{d}_\gamma) + v\vec{t}. \tag{7}$$

In the special case where $\Gamma$ is a plane curve we have $\tau = 0$ ($\Gamma$ is torsionless), and thus
$\vec{\omega}_d = \kappa\vec{b}$. Equation (7) then becomes

$$\dot{\vec{r}}_p = v\kappa\,\vec{b} \times (\vec{r}_p - \vec{d}_\gamma) + v\vec{t}. \tag{8}$$
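
To make the Frenet-Serret quantities concrete, the following Python sketch evaluates them in closed form for a circular helix, a curve with constant curvature and torsion. The helix is our illustrative choice, not an example from the experiments, and the function names are hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def helix_frenet(a, b, s):
    """Frenet-Serret data of d(s) = (a cos(s/c), a sin(s/c), b s/c), c = sqrt(a^2 + b^2).

    Returns the tangent, principal normal, binormal, curvature and torsion
    at arc length s (assumes a > 0).
    """
    c = math.hypot(a, b)
    th = s / c
    t = (-a / c * math.sin(th), a / c * math.cos(th), b / c)  # t = d'
    n = (-math.cos(th), -math.sin(th), 0.0)                   # n = d'' / kappa
    bn = cross(t, n)                                          # b = t x n
    kappa = a / c ** 2
    tau = b / c ** 2
    return t, n, bn, kappa, tau


def darboux(t, bn, kappa, tau):
    """Darboux vector omega_d = tau t + kappa b."""
    return tuple(tau * ti + kappa * bi for ti, bi in zip(t, bn))


def point_velocity(r_p, d_gamma, t, omega_d, v=1.0):
    """Equation (7): velocity of a body point when C moves along the curve with speed v."""
    rel = tuple(r - d for r, d in zip(r_p, d_gamma))
    rot = cross(omega_d, rel)
    return tuple(v * (w + ti) for w, ti in zip(rot, t))
```

Setting `b = 0` gives a circle, for which the torsion vanishes and the Darboux vector reduces to $\kappa\vec{b}$, as in the planar case of equation (8).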


3.3  Simple  Motions  of  Objects 

Objects move in reaction to forces which are being applied to them. When the forces acting on
an object are added, the resultant force $\vec{F}$ determines the direction of motion and the moments
of the forces (or the torques) determine the rotation of the object. If the force $\vec{F}$ is applied to
the object $B$ at the point $P$, the moment $\vec{M}$ is given by $\vec{M} = \vec{r}_p \times \vec{F}$ where $\vec{r}_p$ is the position of
$P$ relative to a point $C$. $\vec{M}$ has the same direction as the axis of the rotation of $B$ that results
from applying $\vec{F}$.

The  engine  of  a  vehicle  needs  to  apply  force  to  the  vehicle  in  order  to  move  it  from  one 
position  to  another.  If  the  path  is  prespecified  (as  in  the  case  of  a  ground  vehicle  on  a  road), 
efficient  application  of  the  force  requires  that  the  angles  between  the  instantaneous  directions 
of  the  force  and  the  directions  of  the  path  elements  be  small.  The  force  differential  generates 
torques  which  help  turn  the  vehicle  around  the  axis  of  rotation  normal  to  the  (osculating)  plane 
of  the  path.  During  a  turn,  the  wheels  rotate  with  different  speeds;  the  greater  the  distance 
between  the  wheels  the  larger  their  difference  in  speed.  In  order  to  minimize  this  difference  the 
distance  between  the  wheels  needs  to  be  small.  Also,  when  forces  are  applied  to  the  wheels  the 
resulting  torques  are  larger  when  the  vehicle  is  moving  along  a  short  axis;  but  these  torques 
need  to  be  as  small  as  possible  to  improve  the  handling  of  and  minimize  stresses  on  the  vehicle. 
Because  of  all  these  factors  the  principal  axis  of  inertia  of  the  vehicle  should  be  tangent  to  the 
path of the vehicle. It should be pointed out [7] that the translational velocity at any point on
a  ground  vehicle  is  typically  orders  of  magnitude  larger  than  its  rotational  velocity  (around  the 
vehicle’s  center  of  mass).  The  rotational  velocity  becomes  significant  only  when  the  vehicle  is 
observed  over  a  significant  period  of  time  (at  least  several  frames). 

In  the  case  of  a  moving  tool  the  force  is  used  not  only  to  move  the  tool,  but  to  act  on 
a  recipient  object.  Therefore,  the  required  force  depends  on  the  task.  For  example,  sawing 
involves  continuously  exerting  a  force  perpendicular  to  the  path  of  the  saw;  tightening  with  a 
wrench  involves  continuously  exerting  torque  around  the  axis  of  rotation.  (Note  that  the  force 
may  not  be  applied  to  the  recipient  object  continuously;  for  example,  when  we  swing  a  hammer, 
the  force  is  applied  only  when  the  head  of  the  hammer  hits  the  object.)  Developing  a  general 
theory  of  tool  motion  is  a  subject  of  our  continuing  research. 


4  Computing  Motion  from  Image  Sequences 


For the purpose of estimating object motion from images we rewrite equation (2) in the following
way:

$$\dot{\vec{r}}_p = \vec{\omega} \times (\vec{r}_p - \vec{d}_c) + \dot{\vec{d}}_c = \vec{\omega} \times \vec{r}_p + \vec{T} \tag{9}$$

where $\vec{T} = \dot{\vec{d}}_c - \vec{\omega} \times \vec{d}_c = (U\ V\ W)^T$ is the translational velocity expressed in the fixed (camera)
coordinate frame $Oxyz$. We will later show how the translational velocity $\dot{\vec{d}}_c$ can be recovered
from $\vec{T}$.


4.1  The  Imaging  Models 

Let $(X, Y, Z)$ denote the Cartesian coordinates of a scene point with respect to the fixed camera
frame (see Figure 2), and let $(x, y)$ denote the corresponding coordinates in the image plane.
The equation of the image plane is $Z = f$, where $f$ is the focal length of the camera. The
perspective projection onto this plane is given by

$$x = \frac{fX}{Z}, \qquad y = \frac{fY}{Z}. \tag{10}$$

For weak perspective projection we need a reference point $(X_c, Y_c, Z_c)$. A scene point $(X, Y, Z)$
is first projected onto the point $(X, Y, Z_c)$; then, through plane perspective projection, the point
$(X, Y, Z_c)$ is projected onto the image point $(x, y)$. The projection equations are then given by

$$x = \frac{X}{Z_c}f, \qquad y = \frac{Y}{Z_c}f. \tag{11}$$

Figure 2: The plane perspective projection image of $P$ is $F = f(X/Z, Y/Z, 1)$; the weak
perspective projection image of $P$ is obtained through the plane perspective projection of the
intermediate point $P_1 = (X, Y, Z_c)$ and is given by $G = f(X/Z_c, Y/Z_c, 1)$.
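
The two imaging models differ only in the depth used for the division. The following Python sketch is a minimal illustration of the two projections; the function names are ours.

```python
def perspective(X, Y, Z, f):
    """Plane perspective projection of the scene point (X, Y, Z) onto the plane Z = f."""
    return (f * X / Z, f * Y / Z)


def weak_perspective(X, Y, Z_c, f):
    """Weak perspective projection: the point's own depth is replaced by the reference depth Z_c."""
    return (f * X / Z_c, f * Y / Z_c)
```

The two images coincide for points at the reference depth, and the discrepancy grows with the ratio between the depth variation of the object and $Z_c$.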


4.2  The  Image  Motion  Field  and  the  Optical  Flow  Field 


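The instantaneous velocity of the image point $(x, y)$ under perspective projection is obtained by
taking the derivatives of (10) and using (9):

$$\dot{x} = \frac{Uf - xW}{Z} - \omega_x\frac{xy}{f} + \omega_y\left(\frac{x^2}{f} + f\right) - \omega_z y, \tag{12}$$

$$\dot{y} = \frac{Vf - yW}{Z} - \omega_x\left(\frac{y^2}{f} + f\right) + \omega_y\frac{xy}{f} + \omega_z x. \tag{13}$$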

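The instantaneous velocity of the image point $(x, y)$ under weak perspective projection can
be obtained by taking derivatives of (11) with respect to time and using (9):

$$\dot{x} = f\,\frac{\dot{X}Z_c - X\dot{Z}_c}{Z_c^2} = \frac{Uf - xW}{Z_c} + f\omega_y\frac{Z}{Z_c} - \omega_z y, \tag{14}$$

$$\dot{y} = f\,\frac{\dot{Y}Z_c - Y\dot{Z}_c}{Z_c^2} = \frac{Vf - yW}{Z_c} - f\omega_x\frac{Z}{Z_c} + \omega_z x. \tag{15}$$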


Let $\vec{i}$ and $\vec{j}$ be the unit vectors in the $x$ and $y$ directions, respectively; $\dot{\vec{r}} = \dot{x}\vec{i} + \dot{y}\vec{j}$ is the
projected motion field at the point $\vec{r} = x\vec{i} + y\vec{j}$. If we choose a unit direction vector $\vec{n}_r$ at the
image point $\vec{r}$ and call it the normal direction, then the normal motion field at $\vec{r}$ is $\dot{\vec{r}}_n = (\dot{\vec{r}} \cdot \vec{n}_r)\vec{n}_r$.
$\vec{n}_r$ can be chosen in various ways; the usual choice (as we shall now see) is the direction of the
image intensity gradient.

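Let $I(x, y, t)$ be the image intensity function. The time derivative of $I$ can be written as

$$\frac{dI}{dt} = \frac{\partial I}{\partial x}\frac{dx}{dt} + \frac{\partial I}{\partial y}\frac{dy}{dt} + \frac{\partial I}{\partial t} = (I_x\vec{i} + I_y\vec{j}) \cdot (\dot{x}\vec{i} + \dot{y}\vec{j}) + I_t = \nabla I \cdot \dot{\vec{r}} + I_t$$

where $\nabla I$ is the image gradient and the subscripts denote partial derivatives.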

If we assume $dI/dt = 0$, i.e. that the image intensity does not vary with time, then we have
$\nabla I \cdot \vec{u} + I_t = 0$. The vector field $\vec{u}$ in this expression is called the optical flow. If we choose the
normal direction $\vec{n}_r$ to be the image gradient direction, i.e. $\vec{n}_r = \nabla I/\|\nabla I\|$, we then have

$$\vec{u}_n = (\vec{u} \cdot \vec{n}_r)\vec{n}_r = \frac{-I_t\nabla I}{\|\nabla I\|^2} \tag{16}$$

where $\vec{u}_n$ is called the normal flow.

It was shown in [15] that the magnitude of the difference between $\vec{u}_n$ and the normal motion
field $\dot{\vec{r}}_n$ is inversely proportional to the magnitude of the image gradient. Hence $\dot{\vec{r}}_n \approx \vec{u}_n$ when
$\|\nabla I\|$ is large. Equation (16) thus provides an approximate relationship between the 3-D motion
and the image derivatives. We will use this approximation later in this paper.


5  Tool  Motion 

We assume that the tool is (approximately) planar and that its velocity is composed of a
translational velocity in the plane of the tool and a rotational velocity around an axis orthogonal to
the plane of the tool.


5.1  The  Image  Motion  Field  of  a  Wielded  Tool 


Let the normal to the plane be $\vec{N} = (N_x\ N_y\ N_z)^T$; the equation of the plane orthogonal to $\vec{N}$
which passes through the point $(0, 0, Z_0)$ on the $z$-axis of the $Oxyz$ coordinate frame is given by

$$XN_x + YN_y + (Z - Z_0)N_z = 0. \tag{17}$$

If we assume a nondegenerate view (i.e., $N_z > 0$) for points on the plane we obtain from (17)
and (10)

$$\frac{1}{Z} = \frac{1}{fZ_0}\left(f + x\frac{N_x}{N_z} + y\frac{N_y}{N_z}\right) = \frac{1}{fZ_0}(f + px + qy) \tag{18}$$

where $p = N_xN_z^{-1}$ and $q = N_yN_z^{-1}$. From our assumption about rotational velocity it follows
that we have $\vec{\omega} = (p\omega_z\ q\omega_z\ \omega_z)^T$ for some $\omega_z$. Also, since we have assumed that the translation
is in the plane of the tool we have $\vec{N} \cdot \vec{T} = 0$, or equivalently

$$(p\ q\ 1)^T \cdot (U\ V\ W)^T = Up + Vq + W = 0.$$

It follows that we have

$$W = -Up - Vq. \tag{19}$$

From (12-13), (18), and (19) we obtain the equations of projected motion for points on the
plane:

$$\dot{x} = \frac{U_0f + (U_0p + V_0q)x}{f}(f + px + qy) + \omega_z\left(-y + qf - \frac{pxy}{f} + \frac{qx^2}{f}\right), \tag{20}$$

$$\dot{y} = \frac{V_0f + (U_0p + V_0q)y}{f}(f + px + qy) + \omega_z\left(x - pf + \frac{qxy}{f} - \frac{py^2}{f}\right). \tag{21}$$

Equations (20-21) relate the image (projected) motion field to the scaled components of the
translational velocity $Z_0^{-1}U = U_0$ and $Z_0^{-1}V = V_0$, the rotational parameter $\omega_z$, and the normal
to the plane $(p\ q\ 1)^T$.

Given the point $\vec{r} = x\vec{i} + y\vec{j}$ and the normal direction $n_x\vec{i} + n_y\vec{j}$, from (20-21) the normal
motion field $\dot{\vec{r}}_n \cdot \vec{n} = n_x\dot{x} + n_y\dot{y}$ is given by

$$\begin{aligned}
\dot{\vec{r}}_n \cdot \vec{n} &= U_0(f + px + qy)[n_x + (xn_x + yn_y)pf^{-1}] + V_0(f + px + qy)[n_y + (xn_x + yn_y)qf^{-1}] \\
&\quad + \omega_z[n_x(-y + qf - pxyf^{-1} + qx^2f^{-1}) + n_y(x - pf + qxyf^{-1} - py^2f^{-1})] \\
&= U_0\varphi_1(p, q; \vec{r}, \vec{n}) + V_0\varphi_2(p, q; \vec{r}, \vec{n}) + \omega_z\varphi_3(p, q; \vec{r}, \vec{n})
\end{aligned} \tag{22}$$

where the $\varphi$s are nonlinear functions of $p$, $q$, $\vec{r}$, and $\vec{n}$ given by

$$\varphi_1(p, q; \vec{r}, \vec{n}) = (f + px + qy)[n_x + (xn_x + yn_y)pf^{-1}], \tag{23}$$

$$\varphi_2(p, q; \vec{r}, \vec{n}) = (f + px + qy)[n_y + (xn_x + yn_y)qf^{-1}], \tag{24}$$

$$\varphi_3(p, q; \vec{r}, \vec{n}) = n_x(-y + qf - pxyf^{-1} + qx^2f^{-1}) + n_y(x - pf + qxyf^{-1} - py^2f^{-1}). \tag{25}$$

In Equation (22) $\vec{r}$ and $\vec{n}$ are observable from images, while the 5-tuple $(p, q, U_0, V_0, \omega_z)$ is
not directly observable. To estimate this 5-tuple we need estimates of $\dot{\vec{r}}_n \cdot \vec{n}$ at five or more
image points.


5.2  Estimating  Tool  Motion  from  Normal  Flow 

If we use the spatial image gradient as the normal direction, $\vec{n}_r = \nabla I/\|\nabla I\| = n_x\vec{i} + n_y\vec{j}$, and
$\dot{\vec{r}}_n \approx \vec{u}_n$, we can obtain an approximate equation corresponding to (22) by replacing the left hand
side of (22) by the normal flow $-I_t/\|\nabla I\|$. This equation involves the five unknowns
$(p, q, U_0, V_0, \omega_z)$. For each point $(x_i, y_i)$, $i = 1, \ldots, m$, of the image at which $\|\nabla I(x_i, y_i, t)\|$ is large we can
write one such equation. If we have $m$ such points, where $m \ge 5$, we have an over-determined
system of equations

$$\Phi(p, q)\,(U_0\ V_0\ \omega_z)^T = \vec{b} \tag{26}$$

where the $m \times 3$ matrix function $\Phi$ is given by

$$\Phi(p, q) = (\vec{\varphi}_1(p, q)\ \ \vec{\varphi}_2(p, q)\ \ \vec{\varphi}_3(p, q))$$

(i.e., its columns are $m$-vectors that correspond to values of $\varphi$ at points $(x_i, y_i)$), and the elements
of the $m$-vector $\vec{b}$ are $-(\partial I(x_i, y_i, t)/\partial t)/\|\nabla I(x_i, y_i, t)\|$.

We seek the solution of the system (26) for which $\|\vec{b} - \Phi \cdot (U_0\ V_0\ \omega_z)^T\|$ is minimal, i.e.,
we are seeking the solution of (26) in the least squares sense. This is a separable nonlinear least
squares problem; a good stable solution and an algorithm were given by Golub and Pereyra in
[10]. It was shown that the problem is equivalent to minimizing

$$r(p, q) = \|\vec{b} - \Phi(p, q)\Phi^+(p, q)\vec{b}\|, \tag{27}$$

where $\Phi^+$ is the generalized inverse of $\Phi$. $r(p, q)$ is first minimized to obtain optimal values $\hat{p}$
and $\hat{q}$ of $p$ and $q$ respectively; these values are then used to obtain $\Phi(\hat{p}, \hat{q})$. The linear least
squares method is then used to minimize $\|\vec{b} - \Phi(\hat{p}, \hat{q}) \cdot (U_0\ V_0\ \omega_z)^T\|$ and obtain optimal values
of the motion parameters $U_0$, $V_0$, and $\omega_z$. After estimating $p$, $q$, $U_0$, $V_0$, and $\omega_z$ we use (19) to
obtain $W_0$. Finally, we obtain $\vec{N} = (p\ q\ 1)^T(1 + p^2 + q^2)^{-1/2}$ and $\|\vec{\omega}\| = \sqrt{\omega_z^2 + p^2\omega_z^2 + q^2\omega_z^2}$.
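
The separable structure of the problem can be sketched as follows. For fixed $(p, q)$ the inner step solves the linear least squares problem for $(U_0, V_0, \omega_z)$ and returns the reduced residual $r(p, q)$ of (27); the outer step searches over $(p, q)$. This Python sketch is an illustration under stated assumptions: the inner step uses the normal equations instead of a generalized inverse, and the outer step is a crude pattern search that stands in for the algorithm of [10] and may stop at a local minimum. All function names, the starting point and the step sizes are hypothetical.

```python
import math


def phi_row(p, q, x, y, nx, ny, f):
    """Row (phi_1, phi_2, phi_3) of equations (23)-(25) at one image point."""
    s = f + p * x + q * y
    d = x * nx + y * ny
    phi1 = s * (nx + d * p / f)
    phi2 = s * (ny + d * q / f)
    phi3 = (nx * (-y + q * f - p * x * y / f + q * x * x / f)
            + ny * (x - p * f + q * x * y / f - p * y * y / f))
    return (phi1, phi2, phi3)


def solve3(A, b):
    """Solve a nonsingular 3x3 system by Gaussian elimination with partial pivoting."""
    M = [list(A[i]) + [b[i]] for i in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= factor * M[col][k]
    sol = [0.0] * 3
    for i in (2, 1, 0):
        acc = M[i][3] - sum(M[i][k] * sol[k] for k in range(i + 1, 3))
        sol[i] = acc / M[i][i]
    return sol


def inner_fit(p, q, samples, b, f):
    """Best (U0, V0, omega_z) for fixed (p, q) and the residual norm r(p, q)."""
    rows = [phi_row(p, q, x, y, nx, ny, f) for (x, y, nx, ny) in samples]
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * bi for r, bi in zip(rows, b)) for i in range(3)]
    params = solve3(AtA, Atb)
    res = math.sqrt(sum((bi - sum(r[k] * params[k] for k in range(3))) ** 2
                        for r, bi in zip(rows, b)))
    return params, res


def fit_tool_motion(samples, b, f, p0=0.0, q0=0.0, step=0.5, tol=1e-8, max_iter=10000):
    """Pattern search over (p, q) on the reduced residual; not the method of [10]."""
    p, q = p0, q0
    _, best = inner_fit(p, q, samples, b, f)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for dp, dq in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            _, r = inner_fit(p + dp, q + dq, samples, b, f)
            if r < best:
                p, q, best, improved = p + dp, q + dq, r, True
        if not improved:
            step /= 2.0
    params, best = inner_fit(p, q, samples, b, f)
    return p, q, params, best
```

Here `samples` is a list of tuples `(x, y, nx, ny)` and `b` is the list of normal flow values at the same points. The search never increases the residual, but a better initial guess or a proper variable projection solver would be needed in practice.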

We have estimated the translational velocity $\vec{T}$ and the rotational velocity $\vec{\omega}$ in the camera
coordinate system $Oxyz$. We are interested in the translational and rotational velocity expressed
in the Frenet-Serret frame $Ctnb$. By comparing equations (2), (8) and (9) we obtain

$$\vec{\omega} = v\kappa\vec{b}, \qquad \vec{b} = \vec{N}\,\mathrm{sgn}\,\omega_z, \qquad v\kappa = \|\vec{\omega}\| \tag{28}$$

where sgn stands for the ‘sign of’ function. Also, from (2), (8) and (9) we have

$$(U_0\ V_0\ W_0)^T = Z_0^{-1}\vec{T} = Z_0^{-1}(\dot{\vec{d}}_c - \vec{\omega} \times \vec{d}_c) = Z_0^{-1}(v\vec{t} - \vec{\omega} \times \vec{d}_\gamma)$$

and thus

$$\frac{v\vec{t}}{Z_0} = (U_0\ V_0\ W_0)^T + \frac{\vec{\omega} \times \vec{d}_\gamma}{Z_0}. \tag{29}$$

Note that in equation (29) the quantities $Z_0$ and $\vec{d}_\gamma$ (the position of the point $C$, the origin of
the $Ctnb$ frame) are not known. However, let $\vec{d}_\gamma = (X_c\ Y_c\ Z_c)^T$ be the position of $C$ and let
$(x_c, y_c)$ be the image of $C$ (either the tip or the center of mass of the tool). From (18) we obtain

$$\frac{fZ_0}{Z_c} = f + px_c + qy_c$$

so that (29) can be written as

$$\frac{v\vec{t}}{Z_0} = (U_0\ V_0\ W_0)^T + \frac{\vec{\omega} \times (X_c\ Y_c\ Z_c)^T}{Z_0} = (U_0\ V_0\ W_0)^T + \frac{\vec{\omega} \times (x_c\ y_c\ f)^T}{f + px_c + qy_c}. \tag{30}$$
From (30) we obtain the unit vector in the tangent direction $\vec{t}$ by normalizing $v\vec{t}/Z_0$. Finally,
we obtain the unit vector in the normal direction using

$$\vec{n} = \vec{b} \times \vec{t}. \tag{31}$$

Equations (28), (30) and (31) define the Frenet-Serret frame $Ctnb$ expressed in the camera
coordinate system. Equation (28) gives us the curvature $\kappa$ up to an unknown factor $v$ (linear
velocity). We conclude that the Frenet-Serret motion can be recovered up to the speed $v$; note
that the translational velocity $v\vec{t}/Z_0$ does not help here because of the unknown depth $Z_0$.
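
The recovery of the Frenet-Serret frame from the estimated tool parameters can be sketched as follows. The Python function below chains equations (19), (28), (30) and (31); its name and argument order are our assumptions, and the case $\omega_z = 0$, in which the sign of the binormal is undefined, is resolved arbitrarily.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    """Unit vector in the direction of v (v must be nonzero)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def tool_frenet_frame(p, q, U0, V0, wz, xc, yc, f):
    """Frame (t, n, b) and scaled curvature v*kappa in camera coordinates.

    (p, q, U0, V0, wz) are the estimated tool parameters and (xc, yc) is the
    image of the reference point C.
    """
    W0 = -U0 * p - V0 * q                           # equation (19)
    N = normalize((p, q, 1.0))
    omega = (p * wz, q * wz, wz)
    v_kappa = math.sqrt(sum(c * c for c in omega))  # equation (28)
    sign = 1.0 if wz >= 0 else -1.0                 # arbitrary choice when wz == 0
    b = tuple(sign * c for c in N)
    scale = f + p * xc + q * yc
    rot = cross(omega, (xc, yc, f))
    vt_over_Z0 = (U0 + rot[0] / scale,              # equation (30)
                  V0 + rot[1] / scale,
                  W0 + rot[2] / scale)
    t = normalize(vt_over_Z0)
    n = cross(b, t)                                 # equation (31)
    return t, n, b, v_kappa
```

Because both the scaled translation and the rotational term lie in the plane of the tool, the recovered tangent is orthogonal to the binormal up to rounding.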

Finally, we need to recover the orientation of the tool coordinate frame (its long and short
axes) in the $Ctnb$ frame. We find the long and the short axes of the tool as the principal axes
of the set of tool points. The long axis $l$ of the tool and the origin $O$ of the fixed (camera)
coordinate frame $Oxyz$ define a plane $\Pi_l$. Since the image $l'$ of $l$ lies in this plane we can find
$\Pi_l$ using $l'$ in place of $l$. Because we have assumed a nondegenerate view we have two cases:
(i) if the tangent vector $\vec{t}$ lies in $\Pi_l$ the motion is along $l$; (ii) if the normal vector $\vec{n}$ lies in $\Pi_l$
the motion is orthogonal to $l$.

We check if the vector lies in the plane $\Pi_l$ using the following simple algorithm. Let $\vec{p}_1 =
(x_1\ y_1\ f)^T$ and $\vec{p}_2 = (x_2\ y_2\ f)^T$ be the positions of two endpoints on the image $l'$ of $l$. The
normal $\vec{N}_\Pi$ of the plane $\Pi_l$ is given by

$$\vec{N}_\Pi = \vec{p}_1 \times \vec{p}_2.$$

If the vector $\vec{t}$ lies in the plane $\Pi_l$ we have $\vec{N}_\Pi \cdot \vec{t} \approx 0$. So to find out the relative orientation of
the tool frame and the $Ctnb$ frame we only need to find which one of the inner products $|\vec{N}_\Pi \cdot \vec{t}|$
and $|\vec{N}_\Pi \cdot \vec{n}|$ is smaller. (Note that while one of the vectors $\vec{t}$ and $\vec{n}$ lies in the plane $\Pi_l$ the other
vector is not always orthogonal to $\Pi_l$.)
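
The test on the two inner products can be sketched in a few lines of Python. The function name and the returned labels are our assumptions.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def motion_relative_to_long_axis(p1, p2, t, n):
    """Decide whether the motion is along or orthogonal to the long axis of the tool.

    p1, p2 are the endpoints (x, y, f) of the image of the long axis; t and n
    are the tangent and normal of the Frenet-Serret frame in camera coordinates.
    """
    N_pi = cross(p1, p2)        # normal of the plane through O and the long axis
    along = abs(dot(N_pi, t))   # small when t lies in the plane
    across = abs(dot(N_pi, n))  # small when n lies in the plane
    return "along" if along < across else "orthogonal"
```

For an axis imaged along the $x$ direction, a tangent along $x$ is classified as motion along the axis, and a tangent along $y$ as motion orthogonal to it.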


6  Vehicle  Motion 

We  assume  that  the  motion  of  the  vehicle  is  planar  and  that  it  has  a  small  rotational  velocity 
around  the  axis  orthogonal  to  the  plane  of  motion.  The  translational  velocity  is  dominant  and 
at  any  time  t  the  motion  can  be  approximated  by  pure  translational  motion. 


6.1  The  Image  Motion  Field  of  a  Moving  Vehicle 

From (14-15) we obtain the (approximate) equations of projected motion for points on a vehicle
under weak perspective:

$$\dot{x} \approx \frac{Uf - xW}{Z_c}, \tag{32}$$

$$\dot{y} \approx \frac{Vf - yW}{Z_c}. \tag{33}$$

Equations (32-33) relate the image (projected) motion field to the scaled translational velocity
$Z_c^{-1}\vec{T} = Z_c^{-1}(U\ V\ W)^T$.

Given the point $\vec{r} = x\vec{i} + y\vec{j}$ and the normal direction $n_x\vec{i} + n_y\vec{j}$, from (32-33) the normal
motion field $\dot{\vec{r}}_n \cdot \vec{n} = n_x\dot{x} + n_y\dot{y}$ is given by

$$\dot{\vec{r}}_n \cdot \vec{n} = n_xfUZ_c^{-1} + n_yfVZ_c^{-1} - (n_xx + n_yy)WZ_c^{-1}. \tag{34}$$

Let

$$\vec{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} n_xf \\ n_yf \\ -n_xx - n_yy \end{pmatrix}, \qquad \vec{c} = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} UZ_c^{-1} \\ VZ_c^{-1} \\ WZ_c^{-1} \end{pmatrix}. \tag{35}$$

Using (35) we can write (34) as $\dot{\vec{r}}_n \cdot \vec{n} = \vec{a}^T\vec{c}$. The column vector $\vec{a}$ is formed of observable
quantities only, while each element of the column vector $\vec{c}$ contains quantities which are not
directly observable from the images. To estimate $\vec{c}$ we need estimates of $\dot{\vec{r}}_n \cdot \vec{n}$ at three or more
image points.


6.2  Estimating  Vehicle  Motion  from  Normal  Flow 


As  in  Section  5.2  we  use  linear  least  squares  to  estimate  parameter  vector  c  from  the  normal 
flow. 
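
A minimal Python sketch of this linear estimate is given below. It stacks one row $\vec{a}^T$ per image point and solves the normal equations by Cramer's rule; this particular solver is our choice for a self-contained illustration and is adequate only when the 3x3 system is well conditioned. The function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def estimate_scaled_translation(samples, flows, f):
    """Least squares estimate of c = T / Z_c from normal flow.

    samples: list of (x, y, nx, ny), image points with unit normal directions.
    flows:   list of normal flow values, one per sample.
    """
    # One row a^T of equation (35) per image point.
    rows = [(nx * f, ny * f, -(nx * x + ny * y)) for (x, y, nx, ny) in samples]
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * u for r, u in zip(rows, flows)) for i in range(3)]
    d = det3(AtA)
    c = []
    for k in range(3):
        # Cramer's rule: replace column k of AtA by Atb.
        Mk = [[Atb[i] if j == k else AtA[i][j] for j in range(3)] for i in range(3)]
        c.append(det3(Mk) / d)
    return tuple(c)
```

With exact synthetic normal flow generated from (34), the function returns the scaled translation used to generate it.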


In the case of a moving vehicle the parameters of interest are the vehicle’s trajectory and its
rate of approach. The rate of approach

$$\nu = -\frac{W}{Z_c}$$

(measured in sec$^{-1}$) is equivalent to the inverse of the time to collision and corresponds to the
rate with which an object is approaching the camera (or receding from it). The rate $\nu = 0.1$/sec
means that every second the object travels 0.1 of the distance between the observer and its
current position. A negative rate of approach means that the object is going away from the
camera.


The direction of motion $\vec{c} = \vec{T}/Z_c$ gives us the tangent vector $\vec{t} = \vec{c}/\|\vec{c}\|$ of the Frenet-Serret
frame. If the direction of motion changes over time we can use the Frenet-Serret formulas (5)
to recover the (scaled) curvature $v\kappa$ of the trajectory. Given the tangent direction $\vec{t}_0$ at time $t$
and the tangent direction $\vec{t}_1$ at time $t + \Delta t$ we have

$$\vec{n}_0 = v\kappa\vec{n} \approx \frac{\vec{t}_1 - \vec{t}_0}{\Delta t}. \tag{36}$$

The unit vector in the direction $\vec{n}_0$ at time $t$ is the normal vector of the $Ctnb$ frame and the
scaled curvature is given by $v\kappa = \|\vec{n}_0\|$. Finally, we obtain

$$\vec{b} = \vec{t} \times \vec{n}. \tag{37}$$

Equations (36) and (37) give us the normal $\vec{b}$ to the plane of motion and the rotational velocity
of turning (yaw) $\vec{\omega} = v\kappa\vec{b}$.
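
The finite-difference estimate of the scaled curvature can be sketched as follows. The Python function below applies (36) and (37) to two consecutive unit tangent estimates; its name is our assumption, and because the difference quotient is only approximately orthogonal to the first tangent, the returned binormal is only approximately of unit length.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def turning_from_tangents(t0, t1, dt):
    """Normal, binormal, scaled curvature v*kappa and yaw velocity from two tangents.

    t0 and t1 are unit tangent directions at times t and t + dt.
    Returns None when the direction of motion did not change.
    """
    n0 = tuple((b - a) / dt for a, b in zip(t0, t1))  # equation (36)
    v_kappa = math.sqrt(sum(c * c for c in n0))
    if v_kappa == 0.0:
        return None
    n = tuple(c / v_kappa for c in n0)
    b = cross(t0, n)                                  # equation (37)
    omega = tuple(v_kappa * c for c in b)             # yaw velocity v*kappa*b
    return n, b, v_kappa, omega
```

For a vehicle turning in the $xy$ plane at 0.2 radians/sec, two tangents taken 0.01 sec apart give a scaled curvature close to 0.2 and a binormal close to the $z$-axis.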
Figure 3: An experiment using a wrench: (a-h) frames 30, 40, ..., 100. Top images: the input
images. Bottom images: results of flow computation.

7  Experiments 

In the following section we show two examples for each of the domains we have discussed: tools
and vehicles. As was mentioned before, tools usually operate by planar motion, advancing along
a line (drill) or moving in a plane (sawing). In our examples we show two types of motion:
rotation with negligible translation, and relatively small rotation and dominant translation. In
Section 7.1 we will analyze saw and wrench examples.

A  ground  vehicle's  motion  usually  takes  place  on  terrain  that  has  a  small  slope  and  on  a  road 
with  a  limited  rate  of  turn.  This  results  in  small  values  of  pitch  and  yaw,  i.e.  in  locally  planar, 
translational  motion.  Long  sequences  are  needed  to  detect  basic  maneuvers  such  as  turning  or 
lane  changing.  In  Section  7.2  we  analyze  two  examples:  an  accelerating  van  (essentially  linear 
motion)  and  a  turning  taxi. 
Figure 4: Results of experiments on the wrench sequence: the graph shows rotational velocity
in  radians/sec. 

7.1  Motions  of  Tools 

We  tested  our  motion  analysis  algorithm  under  full  perspective  on  two  image  sequences  of  tools 
in  motion.  The  first  sequence,  shown  in  Figure  3,  was  a  200-image  sequence  of  the  movement 
of  a  wrench  tightening  a  bolt. 

The motion of the wrench was a rotation (to turn the bolt) around an axis approximately
orthogonal to the plane of the image. The rotational velocity is shown in Figure 4; it is given in
radians/sec and it corresponds to the scaled curvature $v\kappa$. Figure 5 shows the orientation of the
principal axis of the wrench and the instantaneous translational velocity vector of its centroid
(obtained using equation (30)), both measured in radians. As we see, the translational velocity
vector remains approximately orthogonal to the principal axis throughout the motion sequence.
The Frenet-Serret frame has its binormal $\vec{b}$ in the direction of the negative $z$-axis, its tangent $\vec{t}$
in the image plane and orthogonal to the principal axis of the wrench, and its normal $\vec{n}$ in the
image plane and oriented from the centroid of the wrench toward the bolt.
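The orthogonality between the translational velocity and the principal axis can be checked numerically from the two orientation angles. The sketch below is an illustration only: the function name `deviation_from_orthogonal` is hypothetical, and it assumes both orientations are given in radians in the image plane. Since an axis has no preferred sign, the angle difference is taken modulo $\pi$.

```python
import math


def deviation_from_orthogonal(axis_angle, velocity_angle):
    """Return how far (in radians) two image-plane orientations are from
    being orthogonal.

    The difference is reduced modulo pi because an axis and its opposite
    describe the same orientation. The result lies in [0, pi/2]; zero means
    the velocity direction is exactly orthogonal to the axis.
    """
    diff = (velocity_angle - axis_angle) % math.pi
    return abs(diff - math.pi / 2.0)
```

A result near zero for every frame of a sequence would indicate that the velocity direction stays orthogonal to the principal axis, as described for the wrench.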

We also tested our motion analysis algorithm on a 200-image sequence of a saw performing a
periodic motion. Figure 6 presents part of the sequence. Flow results are given below each
image. The motion of the saw was pure translation ($\|\vec{\omega}\| = 0$). As can be seen from Figure 7
the motion is mostly fronto-parallel (the $z$ component of the translational velocity is small).
The motion is periodic in the direction of the principal axis of inertia. It is a simple case of a
(periodic) straight-line motion with the Frenet-Serret frame corresponding to the principal axes
of the saw; $\vec{t}$ corresponds to the longest axis, and $\vec{b}$ to the shortest axis.

These graphs show that the motion components have a simple behavior; before they reach
their extremal values they can be approximated by straight lines, indicating constant relative
accelerations.

Figure 5: Results of experiments on the wrench sequence. The solid line corresponds to the
orientation (in radians) of the instantaneous direction of translation of the centroid of the wrench,
and the dashed line corresponds to the orientation (in radians) of the principal axis of the wrench.


7.2  Motions  of  Vehicles 

For  vehicle  motion  we  also  used  two  image  sequences,  and  we  used  the  algorithms  for  weak 
perspective.  In  the  first  experiment  we  used  an  image  sequence  of  a  van  taken  from  another 
vehicle  following  the  van.  The  sequence  consisted  of  56  frames  (slightly  less  than  two  seconds). 
Figure  9  shows  frames  5,  15,  25,  and  35  as  well  as  the  corresponding  normal  flow  on  the  van. 
Figure 10 shows estimated values of $UZ^{-1}$, $VZ^{-1}$, and $WZ^{-1}$. These values correspond to
the relative translation of the van and the vehicle carrying the camera (observer coordinate
system). Because of our choice of the coordinate system the rate of approach $v$ corresponds to
the negative of $WZ^{-1}$, i.e. $v = -WZ^{-1}$. The graph shows that there is an impending collision
(rate of approach greater than 1 sec$^{-1}$). Around the 20th frame the rate of approach becomes
zero  (as  do  all  the  velocity  components)  and  after  that  it  becomes  negative  because  the  van 
starts  pulling  away  from  the  vehicle  carrying  the  camera. 
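The sign convention relating the scaled depth component of the translation to the rate of approach can be sketched as follows. The function names are hypothetical, and the sample values in the usage note are made up for illustration; they are not measurements from the van sequence.

```python
def rates_of_approach(w_over_z):
    """Convert per-frame estimates of W/Z into rates of approach (1/sec).

    With the coordinate system used here, the rate of approach is the
    negative of W/Z.
    """
    return [-w for w in w_over_z]


def first_receding_frame(w_over_z):
    """Return the index of the first frame with a negative rate of approach,
    or None if the object never recedes."""
    for index, rate in enumerate(rates_of_approach(w_over_z)):
        if rate < 0.0:
            return index
    return None
```

For a hypothetical series such as `[-1.2, -0.6, 0.0, 0.4]`, the rates of approach are positive, then zero, then negative, and `first_receding_frame` returns the index of the last entry, where the object starts pulling away.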

In  the  second  experiment  we  used  an  image  sequence  of  a  turning  taxi  taken  by  a  stationary 
camera. The sequence consisted of 21 frames. Figure 11 shows frames 1, 9, 15, and 21 as well
as the corresponding normal flow on the vehicles. Figure 12 shows estimated values of $UZ^{-1}$,
$VZ^{-1}$, and $WZ^{-1}$. These values correspond to the relative translation of the taxi. The graph
shows that there is a large $W$ component in the turn (the taxi is receding), and that the turn is
to the right (negative $U$, positive $V$).


Figure 6: An experiment using a saw: (a-h) frames 30, 40, ..., 100. Top images: the input
images. Bottom images: results of flow computation.

8  Conclusions 

Many  types  of  common  objects,  such  as  tools  and  vehicles,  usually  move  in  simple  ways  when 
they  are  wielded  or  driven:  The  natural  axes  of  the  object  tend  to  remain  aligned  with  the  local 
trihedron  defined  by  the  object’s  trajectory.  In  this  paper  we  have  considered  the  relationship 
between  this  constrained  motion  and  the  object’s  geometry.  To  analyze  this  relationship  we 
have  used  two  frames:  the  object  frame  and  the  frame  of  the  motion  trajectory.  Assuming  a 
constant  relationship  between  the  object  frame  and  the  motion  frame  during  the  motion,  we 
have  used  Frenet-Serret  motion  as  a  motion  model.  Using  the  Frenet-Serret  frame  has  provided 
us  with  an  ability  to  understand  motion  over  a  long  time  period. 

We have derived equations for understanding the motions of tools and vehicles under full
and weak perspective. We have recovered descriptions of an object's motion and the space
curve along which the object moves, using relatively long image sequences. The motion and
trajectory parameters provide a low-level description for understanding the motions of vehicles.
For understanding tools in motion one needs additional knowledge about the tool and the
context. This is a direction for further research.

Figure 7: Results of experiments on the saw sequence. $U$, $V$, $W$ are the scaled (by an unknown
distance $Z^{-1}$) components of the relative translational velocity.

It  is  the  need  for  efficient  force  transfer  that  imposes  a  simple  and  constant  relationship 
between  the  natural  axes  of  the  object  and  the  motion  trajectory.  We  have  used  this  functional 
constraint  in  analyzing  the  motions  of  tools  and  ground  vehicles.  Expanding  this  analysis  to 
other  classes  of  objects  (e.g.  air  vehicles),  as  well  as  expanding  the  vocabulary  that  describes 
the  behavior  of  tools  and  vehicles  (sharp  turn,  skid,  etc.)  [13]  are  other  directions  for  future 
research. 

Figure 10: Results of experiments on the van sequence. $U$, $V$, $W$ are the scaled (by an unknown
distance $Z^{-1}$) components of the relative translational velocity.


Figure 11: A taxi sequence: (a-d) frames 1, 9, 15, 21. Top images: the input images. Bottom
images: results of flow computation.


Figure 12: Results of experiments on the taxi sequence. $U$, $V$, $W$ are the scaled (by an unknown
distance $Z^{-1}$) components of the relative translational velocity.